Rhizobium leguminosarum bv. trifolii strain TA1 is an aerobic, motile, Gram-negative,
non-spore-forming rod that is an effective nitrogen-fixing microsymbiont on the perennial clovers
originating from Europe and the Mediterranean basin. TA1, however, is ineffective with many
annual and perennial clovers originating from Africa and America. Here we describe the
features of R. leguminosarum bv. trifolii strain TA1, together with genome sequence information
and annotation. The 8,618,824 bp high-quality-draft genome is arranged in 6 scaffolds of 32
contigs, contains 8,493 protein-coding genes and 83 RNA-only encoding genes, and is one of
20 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Community
Sequencing Program.



Introduction 



Biological fixation of inert atmospheric dinitrogen gas is a process that can only be
performed by certain prokaryotes in the domains Archaea and Bacteria. By far the greatest
amounts of nitrogen (N) are fixed by specialized soil bacteria (root nodule bacteria or
rhizobia) that form proto-cooperative, non-obligatory symbiotic relationships with
legumes [1]. Indeed, these symbioses contribute ~40 million tonnes of N annually to support
global food production [2].



Species of the legume genus Trifolium (clovers) are amongst the most widely cultivated
pasture legumes. Naturally, this genus inhabits three distinct centers of diversity, with
approximately 28% of species in the Americas, 57% in Eurasia and 15% in Sub-Saharan
Africa [3]. A smaller subset of about 30 species, almost all of Eurasian origin, are widely
grown as annual and perennial species in pasture systems in Mediterranean and temperate
regions [3]. Globally important perennial species of clover include T. repens (white clover),
T. pratense (red clover), T. fragiferum (strawberry clover) and T. hybridum (alsike clover).
Clovers usually form N2-fixing symbioses with the common soil bacterium Rhizobium
leguminosarum bv. trifolii, and different combinations of Trifolium hosts and strains of
R. leguminosarum bv. trifolii can vary markedly in symbiotic compatibility [4], resulting in a
broad range of symbiotic developmental outcomes ranging from ineffective (non-nitrogen-fixing)
nodulation to fully effective N2-fixing partnerships [5].



In Australia, Rhizobium leguminosarum bv. trifolii strain TA1 (initially designated BA-Tas)
has a long history of use as a commercial inoculant for Trifolium spp. [6]. TA1 was originally
isolated from a root nodule on the annual species T. subterraneum in Bridport, Tasmania in
the early 1950s [6]. This isolate is likely to be a naturalized strain of European origin that
arrived by chance in Tasmania in the 1800s. Although widely used as a microsymbiont of
European clovers, it became evident that this soil saprophyte is not acid tolerant [7] and
survives poorly when coated onto clover seed with a peat-based carrier [8-10]. Nevertheless,
TA1 remains the commercial inoculant in Australia for perennial (T. repens, T. pratense,
T. fragiferum, T. hybridum, T. tumens (talish clover)) and annual (T. alexandrinum (berseem
clover), T. glomeratum (cluster clover) and T. dubium (suckling clover)) clovers of European
origin [11]. Furthermore, this R. leguminosarum bv. trifolii strain has been adopted by the
international community as a model organism to investigate the biology of the
Trifolium-Rhizobium symbiosis [12]. Here we present a summary classification and a set of
general features for R. leguminosarum bv. trifolii strain TA1 together with the description of
the complete genome sequence and its annotation.

The Genomic Standards Consortium

Classification and general features 

R. leguminosarum bv. trifolii strain TA1 is a motile, Gram-negative, non-spore-forming rod
(Figure 1 Left and Center) in the order Rhizobiales of the class Alphaproteobacteria. It is
slow growing, forming 1-4 mm diameter colonies within 3-5 days when grown on half Lupin
Agar (½LA) [13] at 28°C. Colonies on ½LA are white-opaque, slightly domed, moderately
mucoid with smooth margins (Figure 1 Right). Minimum Information about the Genome
Sequence (MIGS) is provided in Table 1. Figure 2 shows the phylogenetic neighborhood of
R. leguminosarum bv. trifolii strain TA1 in a 16S rRNA sequence based tree. This strain
clusters closest to R. leguminosarum bv. trifolii T24 and R. leguminosarum bv. phaseoli RRE6
with 99.9% and 99.8% sequence identity, respectively.
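To give a feel for what the identity values above mean, the following minimal sketch computes percent identity between two pre-aligned sequences and estimates how many differing positions a reported identity implies. It is an illustration only and not the method used for this analysis: the function names are hypothetical, the toy sequences are invented, and the 1,307 bp length is taken from the Figure 2 legend on the assumption that the identities were calculated over that same internal region.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def mismatches_implied(identity_pct: float, aligned_length: int) -> int:
    """Approximate number of differing positions for a reported identity."""
    return round(aligned_length * (100.0 - identity_pct) / 100.0)


# Toy sequences (invented): 9 of 10 aligned positions agree.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")

# Assumed alignment length of 1,307 bp (from the Figure 2 legend).
diff_t24 = mismatches_implied(99.9, 1307)
diff_rre6 = mismatches_implied(99.8, 1307)
```

Under these assumptions, the reported identities correspond to only a handful of nucleotide differences across the aligned 16S rRNA region, which is why 16S data alone gives limited resolution among closely related rhizobia.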

Symbiotaxonomy 

Rhizobium leguminosarum bv. trifolii strain TA1 is currently the commercial inoculant for
white (Trifolium repens), red (Trifolium pratense) and strawberry (Trifolium fragiferum)
clovers in Australia. TA1 in general is not as effective for nitrogen fixation on annual
clovers as other strains, such as WSM1325 [34,35]. However, TA1 is of particular interest
because it displays a broad host range for nodulation and nitrogen fixation across annual and
perennial clovers originating from the European and Mediterranean centre of origin of
clovers [1]. TA1 is generally able to nodulate but unable to fix nitrogen with many annual and
perennial clovers originating from Africa and America [34].

Genome sequencing and annotation information

Genome project history 

This organism was selected for sequencing on the basis of its environmental and agricultural
relevance to issues in global carbon cycling, alternative energy production, and
biogeochemical importance, and is part of the Community Sequencing Program at the U.S.
Department of Energy, Joint Genome Institute (JGI) for projects of relevance to agency
missions. The genome project is deposited in the Genomes OnLine Database [33] and an
improved-high-quality-draft genome sequence in IMG. Sequencing, finishing and annotation
were performed by the JGI. A summary of the project information is shown in Table 2.




Figure 1. Images of Rhizobium leguminosarum bv. trifolii strain TA1 using scanning (Left) and transmission
(Center) electron microscopy as well as light microscopy to visualize colony morphology on solid media (Right).
Table 1. Classification and general features of Rhizobium leguminosarum bv. trifolii strain TA1 according to the
MIGS recommendations [14].

MIGS ID     Property                  Term                                           Evidence code
            Current classification    Class Alphaproteobacteria
                                      Order Rhizobiales
                                      Genus Rhizobium                                TAS [20,22-25]
                                      Species Rhizobium leguminosarum bv. trifolii   TAS [20,22,25,26]
            Cell shape                Rod                                            TAS [27]
            Motility                  Motile                                         TAS [27]
            Sporulation               Non-sporulating                                TAS [27]
            Temperature range         Mesophile                                      TAS [27]
            Optimum temperature       28°C                                           TAS [27]
            Biotic relationship
                                                                                     IDA
MIGS-5      Nodule collection date    1953                                           IDA
MIGS-4.1    Longitude                 147.667                                        IDA
MIGS-4.2    Latitude                  -41.0335                                       IDA
MIGS-4.3    Depth                     Not recorded
MIGS-4.4    Altitude                  Not recorded


Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists 
in the literature). These evidence codes are from the Gene Ontology project [30]. 
[Figure 2: phylogenetic tree. Surviving tip labels: Rhizobium soli DS-42T (EF363715), Mesorhizobium loti USDA 3471T (X67229, Gi08881), Mesorhizobium septentrionale SDW 014T (AF508207); surviving bootstrap values: 100, 53; scale bar: 0.01.]



Figure 2. Phylogenetic tree showing the relationship of Rhizobium leguminosarum bv. trifolii strain TA1
(shown in blue print) with some of the root nodule bacteria in the order Rhizobiales based on aligned
sequences of the 16S rRNA gene (1,307 bp internal region). All sites were informative and there were no
gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [31]. The tree was built
using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis [32] with
500 replicates was performed to assess the support of the clusters. Type strains are indicated with a
superscript T. Strains with a genome sequencing project registered in GOLD [33] are in bold print and the GOLD
ID is mentioned after the accession number. Published genomes are designated with an asterisk.
Table 2. Genome sequencing project information for Rhizobium leguminosarum bv. trifolii strain TA1.

MIGS ID      Property                Term
MIGS-31      Finishing quality       Improved high-quality draft
MIGS-28      Libraries used          Illumina GAii shotgun and paired end 454 libraries
MIGS-29      Sequencing platforms    Illumina GAii and 454 GS FLX Titanium technologies
MIGS-31.2    Sequencing coverage     7.8× 454 paired end, 764.2× Illumina
MIGS-30      Assemblers              Velvet 1.0.13, Newbler 2.3, phrap 4.24
MIGS-32      Gene calling methods    Prodigal 1.4, GenePRIMP
             GOLD ID                 Gi0648
             NCBI project ID         63831
             Database: IMG           2510461076
             Project relevance       Symbiotic N2 fixation, agriculture



Growth conditions and DNA isolation 

Rhizobium leguminosarum bv. trifolii strain TA1 was grown to mid-logarithmic phase in TY
rich medium [36] on a gyratory shaker at 28°C. DNA was isolated from 60 ml of cells using a
CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [37].

Genome sequencing and assembly 

The genome of Rhizobium leguminosarum bv. trifolii strain TA1 was sequenced at the Joint
Genome Institute (JGI) using a combination of Illumina [38] and 454 technologies [39]. Two
libraries were generated for this genome: an Illumina GAii shotgun library, which produced
66,421,308 reads totaling 5,048 Mb, and a paired end 454 library with an average insert size
of 13 kb, which produced 393,147 reads totaling 100.1 Mb of 454 data. All general aspects of
library construction and sequencing performed at the JGI can be found at the JGI user
homepage [40]. The initial draft assembly contained 199 contigs in 5 scaffolds. The 454 paired
end data were assembled with Newbler, version 2.3. The Newbler consensus sequences were
computationally shredded into 2 kb overlapping fake reads (shreds). Illumina sequencing data
were assembled with VELVET, version 1.0.13 [41], and the consensus sequences were
computationally shredded into 1.5 kb overlapping fake reads (shreds). We integrated the 454
Newbler consensus shreds, the Illumina VELVET consensus shreds and the read pairs in the
454 paired end library using parallel phrap, version SPS - 4.24 (High Performance Software,
LLC). The software Consed [42-44] was used in the following finishing process. Illumina data
were used to correct potential base errors and increase consensus quality using the software
Polisher developed at JGI (Alla Lapidus, unpublished). Possible mis-assemblies were corrected
using gapResolution (Cliff Han, unpublished), Dupfinisher (Han, 2006), or sequencing cloned
bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in
Consed, by PCR and by Bubble PCR (J-F Cheng, unpublished) primer walks. A total of 275
additional reactions were necessary to close gaps and to raise the quality of the finished
sequence. The estimated genome size is 7.6 Mb and the final assembly is based on 65.3 Mb of
454 draft data, which provides an average of 8.6× coverage of the genome, and 4,864.7 Mb of
Illumina draft data, which provides an average of 640.1× coverage of the genome.
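The coverage figures quoted above follow from dividing the amount of draft sequence data by the estimated genome size. The short sketch below reproduces that arithmetic; the function and variable names are illustrative assumptions, and only the numeric inputs (65.3 Mb, 4,864.7 Mb and 7.6 Mb) come from the text.

```python
def fold_coverage(data_mb: float, genome_mb: float) -> float:
    """Average fold coverage: total sequenced bases divided by genome size."""
    if genome_mb <= 0:
        raise ValueError("genome size must be positive")
    return data_mb / genome_mb


# Values reported in the text; the names are hypothetical.
ESTIMATED_GENOME_MB = 7.6
DRAFT_454_MB = 65.3
DRAFT_ILLUMINA_MB = 4864.7

cov_454 = fold_coverage(DRAFT_454_MB, ESTIMATED_GENOME_MB)
cov_illumina = fold_coverage(DRAFT_ILLUMINA_MB, ESTIMATED_GENOME_MB)

print(f"454 coverage: {cov_454:.1f}x")            # 454 coverage: 8.6x
print(f"Illumina coverage: {cov_illumina:.1f}x")  # Illumina coverage: 640.1x
```

Note that these values are calculated against the 7.6 Mb size estimate rather than the 8,618,824 bp of the final assembly, so dividing by the assembled length would give somewhat lower coverage.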

Genome annotation 

Genes were identified using Prodigal [45] as part of the DOE-JGI Annotation pipeline [46],
followed by a round of manual curation using the JGI GenePRIMP pipeline [47]. The predicted
CDSs were translated and used to search the National Center for Biotechnology Information
(NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro
databases. These data sources were combined to assert a product description for each
predicted protein. Non-coding genes and miscellaneous features were predicted using
tRNAscan-SE [48], RNAmmer [49], Rfam [50], TMHMM [51], and SignalP [52]. Additional gene
prediction analyses and functional annotation were performed within the Integrated
Microbial Genomes (IMG-ER) platform [37,53].



Genome properties 

The genome is 8,618,824 nucleotides with 60.74% GC content (Table 3) and comprises 32
contigs in 6 scaffolds (Figure 3). From a total of 8,576 genes, 8,493 were protein encoding
and 83 were RNA-only encoding genes. The majority of genes (77.85%) were assigned a putative
function whilst the remaining genes were annotated as hypothetical. The distribution of genes
into COGs functional categories is presented in Table 4.
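The percentages in this section are simple fractions of the 8,576 total genes. The following sketch recomputes them as a consistency check; the dictionary keys and the helper name are illustrative assumptions, and the counts are the ones reported for the TA1 genome.

```python
TOTAL_GENES = 8576

# Gene counts as reported for the TA1 genome.
gene_counts = {
    "Protein-coding genes": 8493,
    "RNA genes": 83,
    "Genes with function prediction": 6676,
}


def percent_of_total(count: int, total: int = TOTAL_GENES) -> float:
    """Share of the total gene count, as a percentage rounded to 2 decimals."""
    return round(100.0 * count / total, 2)


percentages = {name: percent_of_total(n) for name, n in gene_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")

# Protein-coding and RNA genes together account for every predicted gene.
assert gene_counts["Protein-coding genes"] + gene_counts["RNA genes"] == TOTAL_GENES
```

The same calculation applies to the other gene-level rows of the genome statistics, each of which is expressed relative to the total gene count.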



Table 3. Genome statistics for Rhizobium leguminosarum bv. trifolii strain TA1.

Attribute                           Value        % of Total
Genome size (bp)                    8,618,824    100.00
DNA coding region (bp)              7,407,820    85.95
DNA G+C content (bp)                5,234,677    60.74
Number of scaffolds                 6
Number of contigs                   32
Total genes                         8,576        100.00
RNA genes                           83           0.97
rRNA operons*                       1            0.01
Protein-coding genes                8,493        99.03
Genes with function prediction      6,676        77.85
Genes assigned to COGs              6,673        77.81
Genes assigned Pfam domains         6,944        80.97
Genes with signal peptides          727          8.48
Genes with transmembrane helices    1,897        22.12
CRISPR repeats                      0

*1 copy of 23S rRNA, 2 copies of 16S and 2 copies of 5S rRNA genes
Figure 3. Graphical linear map of the genome of Rhizobium leguminosarum bv. trifolii strain
TA1. From outside to the center: Genes on forward strand (color by COG categories), Genes
on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, other
RNAs black), GC content, GC skew.
Table 4. Number of protein coding genes of Rhizobium leguminosarum bv. trifolii TA1
associated with the general COG functional categories.

Code  Value   %age    COG Category
J     247     3.29    Translation, ribosomal structure and biogenesis
A     1       0.01    RNA processing and modification
K     751     10.01   Transcription
L     317     4.23    Replication, recombination and repair
B     3       0.04    Chromatin structure and dynamics
D     44      0.59    Cell cycle control, mitosis and meiosis
Y     0       0.00    Nuclear structure
V     92      1.23    Defense mechanisms
T     402     5.36    Signal transduction mechanisms
M     365     4.87    Cell wall/membrane biogenesis
N     100     1.33    Cell motility
Z     2       0.03    Cytoskeleton
W     0       0.00    Extracellular structures
U     114     1.52    Intracellular trafficking and secretion
O     217     2.89    Posttranslational modification, protein turnover, chaperones
C     384     5.12    Energy production and conversion
G     746     9.95    Carbohydrate transport and metabolism
E     803     10.71   Amino acid transport and metabolism
F     134     1.79    Nucleotide transport and metabolism
H     235     3.13    Coenzyme transport and metabolism
I     271     3.61    Lipid transport and metabolism
P                     Inorganic ion transport and metabolism
Q     201     2.68    Secondary metabolite biosynthesis, transport and catabolism
R     976     13.02   General function prediction only
S     720     9.60    Function unknown
      1,903   22.19   Not in COGs